Abstract: Energy harvesting (EH) has recently emerged as
an effective way to solve the lifetime challenge of wireless 
sensor networks, as it can continuously harvest energy from 
the environment. Unfortunately, it is challenging to guarantee 
a satisfactory short-term performance in EH communication 
systems because the harvested energy is sporadic. In this paper, 
we consider the channel training optimization problem in EH 
communication systems, i.e., how to obtain accurate channel 
state information to improve the communication performance. In 
contrast to conventional communication systems, the optimization 
of the training power and training period in EH communication 
systems is a coupled problem, which makes such optimization 
very challenging. We formulate the optimal training design
problem for EH communication systems and propose two solutions
that adaptively adjust the training period and power based
on either the instantaneous energy profile or the average energy
harvesting rate. Numerical and simulation results show
that training optimization is important in EH communication
systems. In particular, for short block lengths, training
optimization is critical. In contrast, for long block lengths,
the optimal training period is not very sensitive to either
the block length or the energy profile, so a properly selected
fixed training period can be used.

I. Introduction 

In traditional wireless sensor networks, the limited energy at
each node constrains the network lifetime. Energy harvesting
(EH) is a promising technology that has the potential to
provide perpetual lifetime without requiring external power
cables or periodic battery replacement [1]. Energy harvesting
nodes can harvest energy from the environment, including solar
energy, vibration energy, thermoelectric energy, RF energy, etc.
With its high degree of self-reliance, EH is expected to play
an important role in future green communication networks.

However, employing energy harvesting nodes poses new
challenges related to the link and network design, as the
harvested energy is typically small and random. Thus, although
EH technology improves the long-term performance, guaranteeing
a satisfactory short-term performance remains challenging.
Previous works on EH networks have developed communication
protocols to either maximize the throughput or minimize the
transmission completion time, assuming perfect channel state
information (CSI) at the transmitter and receiver, e.g., [2],
[3], [4]. In [3], a directional water-filling (DWF) algorithm
is proposed to solve the transmit power allocation problem
in EH systems, while in [4], a generalized DWF algorithm is
proposed to solve a general utility maximization problem.



In a wireless communication link, CSI is important, e.g., 
for the receiver to decode the transmitted message, or for rate 
adaptation at the transmitter. At the receiver side, CSI can be 
obtained by sending pilot symbols from the transmitter. There 
exists a tradeoff between the training overhead and the training
performance. Specifically, spending too much energy or time 
on channel training will reduce the energy or time for data 
transmission. On the other hand, training with too little energy 
or time will degrade the estimation performance. To maximize
the throughput, the training period and training power
should be carefully selected. Previous studies have shown
that in conventional communication systems, training power
optimization and training period optimization are decoupled,
of which the power optimization is more important. In [5],
it was shown that for a point-to-point link without the peak
power constraint, the optimal training policy involves sending
one pilot symbol with optimized training power. However, in
EH communication systems, the training design is different
and is largely influenced by the low rate and randomness
of the available energy. The selections of the training
period and training power in EH systems are coupled, and both
depend on the EH profile in the communication block.
Therefore, the training design in EH communication systems
is more challenging and plays a more important role.

In this paper, we investigate the training optimization problem
in EH communication systems. We first characterize the
properties of the training design in an EH communication 
system. We then propose two different training policies to 
determine the training period and power. The first training 
policy adaptively adjusts the training period based on the 
energy profile in the whole transmission block, while the 
second one is designed in an adaptive way according to 
the average EH rate of the block. Simulation results will 
show that training optimization is important to improve the 
communication performance in EH systems, especially when 
the transmission block is not very long. For long block lengths, 
the optimal training period is not too sensitive to the value of 
the block length. Therefore, a fixed training period value can 
be used if properly selected. 

II. System Model 

We consider a point-to-point communication link where 
the transmitter is an EH node, as shown in Figure 1. The 
transmitter can only use the energy it harvests, and we assume 
that all the harvested energy is used for communication. 
Figure 1. The basic system model.



The channel is characterized by block fading, and within a
coherence block, the channel gain $h$ is constant with
$h \sim \mathcal{CN}(0, \sigma_h^2)$. The additive white Gaussian noise
is denoted as $n$ with $n \sim \mathcal{CN}(0, \sigma^2)$. The communication
within one transmission block includes two stages: the training stage and
the data transmission stage. The partition of the two stages
is in units of a time slot $T_s$. The fading block length is
denoted as $T$, with $N = \frac{T}{T_s}$ slots, while the training stage
length is $T_t$, with $N_t = \frac{T_t}{T_s}$ slots. During the training stage,
the receiver obtains an estimate of $h$, denoted as $\hat{h}$, through the
use of a pilot signal. The estimation error is denoted as $\tilde{h}$ with
$\tilde{h} = h - \hat{h}$. Before the transmission stage, the receiver feeds
back the value of $\hat{h}$ to the transmitter. The feedback channel
is assumed to be perfect, while the case with non-ideal feedback
will be discussed in future work.

A. Energy Model 

An important factor that determines the performance of an 
EH system is the EH profile, which models the variation of 
the harvested energy with time. Several different types of EH 
profiles are shown in the left part of Figure 2. For convenience, 
we plot all EH profiles inside a 2-D coordinate system of 
accumulated energy versus time. 

To demonstrate the property and impact of energy profiles,
we adopt similar EH assumptions as in [2], [3]. Specifically,
we assume that the energy profile in the considered transmission
block is known before the communication starts. This
assumption is applicable for predictable energy models, such
as solar energy [6].

The utilization of the harvested energy is constrained by
the EH profile, and therefore an energy neutrality constraint
exists in EH systems [7]. Energy neutrality means that
the energy consumed thus far cannot exceed the total energy
harvested. For simplicity, we assume that the EH node can
only use the energy harvested in the previous slots. If the
consumed power is denoted as $P(t)$, the initial energy in the
energy buffer as $E_0$, and the harvested energy in the $k$th slot
as $E_k$, then the energy neutrality constraints can be expressed
as constraint (1), where $l$ is the index of the time slot
with $l = 1, 2, \ldots, N$.
Figure 2. The energy profile and feasible energy consumption policies in the
2-D coordinate system of accumulated energy versus time. The left part plots 
energy profiles, for two general EH cases, and two special cases: the non-EH 
case and the constant-rate EH case. The right part plots the feasible energy 
consumption domain and policies for a given EH profile, of which the bold 
line is the EH profile, the area under it is the feasible domain, and a curve 
connecting the bottom-left point and the top-right point inside this domain is 
a feasible energy consumption policy, two examples of which are shown. 



As shown in (1), a certain EH profile determines a feasible 
energy consumption domain, and only the policies inside this 
domain are feasible energy consumption policies, both of 
which are plotted in the same coordinate system with the 
EH profile in the right part of Figure 2. Due to the energy
neutrality constraint (1), we cannot use the energy arriving
in the future, but can save the current energy for future
use. This causal energy constraint determines the directional
property of all power allocation policies in EH systems, which 
will be discussed in more detail later. 
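
As an illustration of the causal constraint, the following minimal sketch checks whether a per-slot energy consumption policy stays inside the feasible domain of a given EH profile. It assumes a discrete-time reading of (1) in which slot $l$ may only draw on the initial energy $E_0$ and the energy harvested in slots 1 to $l-1$, with the slot duration normalized to 1. The function name `is_feasible` and the numerical profile are hypothetical and serve only as an example.

```python
from itertools import accumulate


def is_feasible(harvest, consume, tol=1e-12):
    """Check the energy neutrality constraint slot by slot.

    harvest[k] is E_k for k = 0..N-1 (harvest[0] is the initial energy E_0);
    consume[l-1] is the energy spent in slot l, for l = 1..N.
    Assumption: slot l can only use E_0, ..., E_{l-1}.
    """
    if len(harvest) != len(consume):
        raise ValueError("need one harvest value per slot")
    available = accumulate(harvest)   # E_0 + ... + E_{l-1}
    spent = accumulate(consume)       # energy used up to and including slot l
    return all(s <= a + tol for s, a in zip(spent, available))


harvest = [2.0, 0.0, 3.0, 1.0]                       # hypothetical EH profile
print(is_feasible(harvest, [1.0, 1.0, 2.0, 2.0]))    # saves energy for later slots
print(is_feasible(harvest, [1.0, 2.0, 1.0, 2.0]))    # slot 2 would borrow from the future
```

The first policy lies under the accumulated-energy curve in every slot, while the second one crosses it in slot 2, which is exactly the kind of curve that falls outside the feasible domain in the right part of Figure 2.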

Among all kinds of EH processes, there are two special 
cases: the non-EH case and the constant-rate EH case, as 
shown in the left part of Figure 2. Here we treat the conventional
non-EH system, i.e., without the EH function and
only with the average power constraint, as an extreme case 
of energy harvesting, in which all the energy arrives before 
the first slot. This is equivalent to relaxing all the causal 
energy constraints. The feasible energy domain of non-EH 
nodes is the union of all the possible EH profiles with the 
same total energy in a given time duration, so it provides the 
best performance among all the EH profiles. Constant-rate EH 
refers to the node that can harvest energy at a constant rate. 
In this case, the profile can be considered as a deterministic 
process. In practical systems, when the EH profile does not 
change frequently or the block length is small, a constant-rate 
EH profile is a good approximation of the energy profile in 
each transmission block, with the mean of the EH process as 
its harvesting rate. 

The battery capacity is also an important factor for the EH 
link performance besides the EH profile. In this paper we 
assume that the energy buffer is of an infinite capacity, while 
the case with a finite buffer capacity will be dealt with in 
future work. 

III. Impact of Channel Training in EH Systems

In this section, we first investigate the training policy for 
EH systems and compare it with non-EH systems. We will 
then develop power allocation for the data transmission stage. 
Figure 3. Comparison of the power allocation policies in the training stage
for non-EH and EH systems. The left and right figures represent the non-EH 
and EH cases, respectively. 



A. Training Stage in EH Systems 

In the training stage, we denote the average training power
in the $j$th time slot as $\bar{P}_j$ ($1 \le j \le N_t$). The variance of
the estimation error with an MMSE channel estimator is then [8]

$$\sigma_{\tilde{h}}^2 = \frac{\sigma_h^2 \sigma^2}{\sigma^2 + \sigma_h^2 \sum_{j=1}^{N_t} \bar{P}_j}.$$

We see that only the sum of average training powers matters.
This means that as long as the total training power is the same,
the training performance is fully determined, independent of
the training period or the power allocation inside this stage.
Thus, we will use the discrete-time expression $P_j = \bar{P}_j$ to
denote training powers.

Due to the causal energy constraint in EH systems, there 
exists a big difference in the training design for non-EH 
systems and EH systems. In the non-EH system without a
peak power constraint, the optimal $N_t$ is always 1 [5], as
shown in the left part of Figure 3. An intuitive explanation
is that we can always achieve a good training performance
with enough training power (as long as it is less than the
total power available), so the training period should be made
as short as one time slot. Thus, what matters is the
power allocated for channel training rather than the training 
period. However, this is not the case for the EH system. 
Due to the stochastic EH profile, the energy arrival in the 
first time slot may be very small, as shown in the right part 
of Figure 3. Hence, fixing the training period as 1 slot will 
generally provide an inaccurate channel estimation. The total 
training power is largely determined by the training period, 
which makes it more important than the power allocation, and 
increases the difficulty of the training design. 

In EH systems, we select a training power allocation policy
such that, for a given $N_t$, all the harvested energy for
$1 \le j \le N_t - 1$ is exhausted, while there may be some energy left
at slot $N_t$, the value of which is optimized. This is optimal
because it is not possible to find a smaller training period
$N_t' < N_t$ that achieves the same training performance.
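
To make the contrast between the non-EH and EH cases concrete, the sketch below evaluates the MMSE estimation error variance as a function of the total training energy, and the largest training energy that an EH node can have spent after $N_t$ slots. It assumes unit slot duration, unit $\sigma_h^2$ and $\sigma^2$ by default, and the convention that slot $j$ can only use $E_0, \ldots, E_{j-1}$. The function names and the profile values are hypothetical.

```python
def mmse_error_variance(training_energy, var_h=1.0, noise_var=1.0):
    """Estimation error variance; it depends only on the total training energy."""
    return var_h * noise_var / (noise_var + var_h * training_energy)


def max_training_energy(harvest, n_t):
    """Largest energy usable by a training stage of n_t slots.

    Assumption: slot j can only use harvest[0..j-1], where harvest[0] is the
    initial energy E_0, so n_t slots can spend at most E_0 + ... + E_{n_t-1}.
    """
    return sum(harvest[:n_t])


harvest = [0.1, 0.2, 1.5, 0.4]   # hypothetical EH profile with a small first arrival
total = sum(harvest)

# Non-EH reference: the whole budget is available up front, so a single
# pilot slot can carry, say, half of the block energy.
print("non-EH, N_t=1:", mmse_error_variance(0.5 * total))

# EH case: the reachable accuracy is capped by what has arrived so far.
for n_t in range(1, len(harvest) + 1):
    energy = max_training_energy(harvest, n_t)
    print(f"EH, N_t={n_t}: energy <= {energy:.2f}, "
          f"error variance >= {mmse_error_variance(energy):.3f}")
```

With a small first arrival, one training slot leaves the error variance close to $\sigma_h^2$, and only a longer training period unlocks enough energy for an accurate estimate, which is why the training period and the training power cannot be chosen separately in the EH case.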

B. Data Transmission Stage in EH Systems with Estimation 
Errors 

Considering the channel estimation error and the training 
overhead, the average achievable throughput in each time slot 
As shown in [9], this is a lower bound for the capacity with
channel estimation error and we will use it as the performance 
metric in the paper. 

By substituting (3) and adopting (9) in [10], this rate
expression can be finally transformed to the average throughput
expression (4).

Different from non-EH systems, which use a constant transmit
power in the data transmission stage, in the EH system we
need to determine the power allocation between different time
slots, as the power allocated to each slot needs to satisfy the
energy neutrality constraint (1). For given $N_t$, $\hat{h}$ and
$\sigma_{\tilde{h}}^2$, the power allocation problem is as follows:



Problem 1: 
where $E_{te}$ denotes the energy left from the training operation,
and is known before the optimization.

In Problem 1, the first constraint is the energy neutrality 
constraint. In contrast to non-EH systems, even if the chan- 
nel stays unchanged, the power still needs to be adaptively 
allocated from slot to slot due to the causal EH constraints. 
The second constraint means that at the end of the block, 
the node needs to use up all the available energy, as we do 
not consider the energy sharing between blocks to render our 
problem tractable, while the case with block-to-block energy 
sharing will be discussed in future work. 

We make the following two comments on Problem 1. First,
similar to the training power, the data transmission power is 
expressed in a discrete-time form, as it is optimal for the 
power inside one slot to be constant due to the concavity of 
the objective function. Second, as the training power and the 
data transmission power are the same from the perspective of 
energy consumption, we use the same notation P and only 
distinguish between them by the time index. 

The throughput expression with estimation errors satisfies
the condition of the directional water-filling (DWF) algorithm
[11], and thus the optimal power allocation follows DWF. Such
a DWF algorithm has a special property that the solution is
only determined by constraints, irrespective of the parameters


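Algorithm 1 DWF algorithm for Problem 1 with different $N_t$

1) Initialization: Set integers $k_0 = 0$ and $n = 1$.

2) Iteration: Iterate $k_n = \arg\min_{k:\, k_{n-1} < k \le N} \left\{ \frac{\sum_{i=k_{n-1}}^{k-1} E_i}{k - k_{n-1}} \right\}$, with $n$ adding 1 each time, until $k_n = N$, so finally an index set $\mathcal{K}_0 = \{k_n\}$ is constructed.

3) Results for $N_t = 0$: The optimal power in the $i$th slot is $p_n = \frac{\sum_{j=k_{n-1}}^{k_n - 1} E_j}{k_n - k_{n-1}}$ for $i \in [k_{n-1} + 1, k_n]$, and a power set $\mathcal{P}_0 = \{p_n\}$ is obtained for $N_t = 0$.

4) Update for $N_t \ne 0$: Reset $k_0 = N_t$, recalculate $k_1' = \arg\min_{k:\, k_0 < k \le N} \left\{ \frac{E_{te} + \sum_{i=k_0}^{k-1} E_i}{k - k_0} \right\}$, then the index set for $N_t$ is $\mathcal{K}_{N_t} = \{k_1'\} \cup \{\text{all } k_n \in \mathcal{K}_0 \text{ that } k_n > k_1'\}$.

5) Results for $N_t \ne 0$: The power for $i \in [k_0 + 1, k_1']$ is $p_1' = \frac{E_{te} + \sum_{j=k_0}^{k_1' - 1} E_j}{k_1' - k_0}$, while the other powers are unchanged, then the power set for $N_t$ is $\mathcal{P}_{N_t} = \{p_1'\} \cup \{\text{all } p_n \in \mathcal{P}_0 \text{ that } p_n > p_1'\}$.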



in the objective function. So in our problem the solution is
independent of $\hat{h}$ and $\sigma_{\tilde{h}}^2$, i.e., the estimation
performance and the value of the estimated channel gain do not have any
impact on the power allocation. This special property can largely
simplify the power allocation, as there is no need to completely
reallocate the data transmission power for different values of
$N_t$. We only need to execute the power allocation over the
whole block once for $N_t = 0$, and update a few points for other
values of $N_t$. Accordingly, we develop an efficient algorithm
to solve Problem 1 for different $N_t$, as shown in Algorithm 1.

From Algorithm 1, we can see that for a given $N_t$, the
power allocation result consists of several intervals, and the
power is a constant value inside each of these intervals. The
endpoint indices of all intervals form an index set $\mathcal{K}_{N_t}$, while
the powers in these intervals form a power allocation set
$\mathcal{P}_{N_t}$. Furthermore, according to Steps 4 and 5 of Algorithm
1, for different $N_t$, the majority ($i \in [k_1', N]$) of the transmit
power allocation is unchanged, while only a small proportion
($i \in [k_0 + 1, k_1']$) changes with $N_t$. This property makes it
possible to decouple the training power allocation from the
selection of the training period, which will be used in the next
section for the optimal training design.
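
The staircase structure of the DWF solution can be sketched in a few lines. The code below is a minimal illustration, not the authors' implementation: it assumes unit slot duration, that slot $k$ can use $E_0, \ldots, E_{k-1}$, and that the interval endpoint is the index minimizing the average available energy per slot. For clarity it reruns the search from slot $N_t + 1$ instead of reusing the index set computed for $N_t = 0$. The name `dwf`, its `start` and `carry` parameters (standing for $N_t$ and $E_{te}$), and the numerical profile are hypothetical.

```python
def dwf(harvest, start=0, carry=0.0):
    """Directional water-filling for slots start+1..N.

    harvest[k] is E_k (harvest[0] is the initial energy E_0); carry is the
    energy left over from the training stage. Returns (breakpoints, powers),
    where powers lists the per-slot power for slots start+1..N.
    """
    n = len(harvest)
    k_prev, extra = start, carry
    breakpoints, powers = [], []
    while k_prev < n:
        best_k, best_rate, acc = k_prev + 1, float("inf"), extra
        for k in range(k_prev + 1, n + 1):
            acc += harvest[k - 1]            # E_{k-1} becomes usable in slot k
            rate = acc / (k - k_prev)        # average energy per slot up to k
            if rate <= best_rate:            # on ties, keep the longest interval
                best_k, best_rate = k, rate
        breakpoints.append(best_k)
        powers.extend([best_rate] * (best_k - k_prev))
        k_prev, extra = best_k, 0.0          # all energy is used up at a breakpoint
    return breakpoints, powers


harvest = [3.0, 0.0, 0.0, 8.0, 2.0, 2.0]     # hypothetical EH profile
print(dwf(harvest))                          # allocation for N_t = 0
print(dwf(harvest, start=1, carry=1.0))      # N_t = 1 with 1.0 left after training
```

In this example only the first interval changes when one slot is reserved for training, while the later interval keeps its power, which is the property exploited in the next section.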

IV. Optimal Training Design 

As seen from the last section, in EH communication systems,
the coupling of the training period selection and the
training power allocation brings the main difficulty in the 
training design, and the training period selection is especially 
critical. In this section, we will investigate the optimal training 
design in EH systems and propose two training policies. 

A. Problem Formulation 

With the average throughput in (4) as the objective and
considering energy neutrality constraints, the optimal training
problem in EH communication systems is formulated as Problem 2.

This problem has two difficulties: 1) the optimizations of $N_t$
and $P_i$ are coupled; 2) the optimization variable $N_t$ appears in
the summation limit of the objective function, and only takes
discrete values. Due to the intractability of this problem, we
propose a sub-optimal solution in the next subsection.

B. A Sub-optimal Solution 

Due to the difficulties of Problem 2, we adopt a DWF
approximation and a rate approximation to derive a sub-optimal
solution. Both of these simplifications have good
approximation properties, which will be verified by the simulation
results. In addition, for the special case of the constant-rate
EH, both approximations become equivalent to the original
problem. As commented in Section II, the constant-rate EH
model is a good approximation for the energy profile in each
transmission block in different EH systems, so our sub-optimal
solution will be in general close to optimal.

1) DWF Approximation: First, based on the property of
Algorithm 1 as discussed in Section III-B, we make an
approximation to decouple the training power allocation and
the training period selection. From Algorithm 1, the power
allocation in the whole transmission block only changes in a
small number of slots for different values of $N_t$. We make
an approximation that the power allocation is fixed for all
values of $N_t$, i.e., we ignore the possible changes of power
allocation in some slots for different $N_t$. This simplification
decouples $N_t$ and $P_i$, so that we can perform the DWF
power allocation just once, and then optimize $N_t$ over a fixed
power allocation result. In this way, we can get a sub-optimal
solution with low computational complexity.

2) Rate Approximation: With the DWF approximation, the
problem is still intractable, as the variable $N_t$ only takes
integer values and appears in the summation of the objective
function. To further simplify the problem, we make
the following rate approximation. First, for a given value
of $N_t$, we calculate the estimation error assuming a constant
training power equal to the average EH rate $P_H$, i.e.,

$$\sigma_{\tilde{h}}^2 = \frac{\sigma_h^2 \sigma^2}{\sigma^2 + \sigma_h^2 \sum_{i=1}^{N_t} P_i} \approx \frac{\sigma_h^2 \sigma^2}{\sigma^2 + \sigma_h^2 N_t P_H}.$$

Second, we determine the achievable throughput $\bar{R}$ assuming all
the slots including the training period are used for data transmission
with the transmit power equal to the DWF result in the respective slot,
i.e., $\bar{R} = \frac{1}{N} \sum_{i=1}^{N} M_i$, where $M_i = \exp(x_i) E_1(x_i)$
is the average throughput for the $i$th slot considering the estimation
error, with $E_1(\cdot)$ the exponential integral and $x_i$ the reciprocal
of the effective SNR in slot $i$. Finally, we include the throughput loss
due to the training period, i.e.,

$$R = \frac{1}{N} \sum_{i=N_t+1}^{N} M_i \approx \frac{N - N_t}{N} \cdot \frac{1}{N} \sum_{i=1}^{N} M_i = \frac{N - N_t}{N} \bar{R}.$$

To summarize, Step 1 considers the effect of the estimation
error, Step 2 adopts the DWF approximation while ignoring
the time taken by training, and Step 3 takes the time
consumed by training back into consideration. While greatly
simplifying the problem, this rate approximation preserves the
essential tradeoff in the original training design problem, i.e.,
the tradeoff between the resource consumed by training and
the estimation performance.
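
The three steps of the rate approximation can be sketched numerically as follows. This is a minimal illustration under stated assumptions rather than the exact expressions of the paper: the per-slot rate is taken as $\exp(1/\rho_i) E_1(1/\rho_i)$ in nats with an assumed effective SNR $\rho_i = \frac{(\sigma_h^2 - \sigma_{\tilde{h}}^2) P_i}{\sigma^2 + \sigma_{\tilde{h}}^2 P_i}$, the slot duration is normalized to 1, and the best $N_t$ is found by direct search instead of solving a stationarity condition. The helper names `exp_e1`, `slot_rate` and `approx_rate` are hypothetical.

```python
import math


def exp_e1(x, eps=1e-12, max_iter=500):
    """Return exp(x) * E_1(x) for x > 0, where E_1 is the exponential integral."""
    if x <= 1.0:
        # Power series: E_1(x) = -gamma - ln(x) - sum_{k>=1} (-x)^k / (k * k!)
        total, term = 0.0, 1.0
        for k in range(1, max_iter):
            term *= -x / k                   # term = (-x)^k / k!
            total -= term / k
            if abs(term) < eps:
                break
        return math.exp(x) * (-0.5772156649015329 - math.log(x) + total)
    # Continued fraction (modified Lentz), which yields exp(x) * E_1(x) directly.
    b = x + 1.0
    c = 1e300
    d = 1.0 / b
    h = d
    for i in range(1, max_iter):
        a = -i * i
        b += 2.0
        d = 1.0 / (a * d + b)
        c = b + a / c
        delta = c * d
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def slot_rate(power, err_var, var_h=1.0, noise_var=1.0):
    """Assumed per-slot rate lower bound (nats) under Rayleigh fading."""
    snr = (var_h - err_var) * power / (noise_var + err_var * power)
    return exp_e1(1.0 / snr) if snr > 0 else 0.0


def approx_rate(powers, n_t, p_h, var_h=1.0, noise_var=1.0):
    n = len(powers)
    # Step 1: estimation error for n_t training slots at the average EH rate.
    err_var = var_h * noise_var / (noise_var + var_h * n_t * p_h)
    # Step 2: average rate as if every slot carried data at its DWF power.
    r_bar = sum(slot_rate(p, err_var, var_h, noise_var) for p in powers) / n
    # Step 3: discount by the fraction of slots spent on training.
    return (n - n_t) / n * r_bar


powers = [1.0] * 50                          # hypothetical DWF result (constant-rate EH)
p_h = sum(powers) / len(powers)
best = max(range(1, len(powers)), key=lambda n_t: approx_rate(powers, n_t, p_h))
print(best, approx_rate(powers, best, p_h))
```

The search returns an interior value of $N_t$: too few training slots leave a large estimation error, while too many waste slots that could carry data.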

3) Solution to the Simplified Problem: Based on the previous
two steps, the training design problem can be formulated as:
Problem 3:
where all $P_i$ are determined through Steps 1-3 of Algorithm 1.
The solution for Problem 3 is a sub-optimal solution for Problem 2.

For simplicity, we denote x = When x is assumed con- 
tinuous, the objective function is concave, the proof of which 
is omitted due to space limitation. Through the derivative with 
respect to x we can get an implicit solution, i.e., the solution 
of Problem 3 is the solution of the following equation (the 
discretization part is omitted due to space limitation) 
(5)

where d = ° h 2 r ""; . 

When $N$ approaches infinity, we can get an asymptotic
solution of $N_t$ and its ratio over $N$ in closed form, given in (6).
From the expression of solution (6), we see that the optimal
training period is influenced by the block length and EH profiles.
Generally speaking, a larger block length will result in a
longer training period $N_t$, but a smaller training period ratio
$\alpha = \frac{N_t}{N}$. When $N$ approaches infinity, $N_t$ also approaches
infinity, while the ratio $\alpha$ approaches zero.

C. A Special Case - The Constant-rate EH Profile 

The constant-rate EH process can be used to approximate 
any general EH system when the energy harvesting rate does 
not change intensively. Thus, in this section we will show that 
the optimal solution of the constant-rate EH case can provide 
another sub-optimal solution for the general EH systems with 
the same average EH rate. This solution is very practical as it 
only needs the mean value of the EH profile, rather than its 
instantaneous realization. 

To optimize $N_t$ for the constant-rate EH case, the gradient
analysis of throughput shows that the optimal value for both
the training and transmission powers equals the EH rate,
denoted by $P_H$, i.e., the transmit power is a constant in both
stages, and we only need to determine $N_t$, the training period.



By applying (5) to the constant-rate EH case, the optimal
$N_t$ for the constant-rate EH is the solution of (7).

Note that (7) is the exact optimal solution for a constant-rate
EH system. Meanwhile, it also provides an approximate
solution for a general EH communication system. Thus, we
propose a second sub-optimal solution to Problem 2 as follows.
First, we equate the total energy in a given transmission block
of a general EH system with that of a constant-rate EH system,
from which we can get an equivalent $P_H$, the average EH
rate. Then, the approximate solution is derived by solving the
training optimization problem for a constant-rate EH system
with rate $P_H$. This provides a sub-optimal value of $N_t$. Once
we get this value of $N_t$, the directional water-filling algorithm
can be applied for power allocation in the data transmission
stage to further improve the performance.
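
The second sub-optimal procedure can be sketched as follows. This is a hedged illustration only: it equates the total block energy to obtain $P_H$, and then finds $N_t$ by exhaustive search over an assumed constant-power throughput objective, in place of solving the closed-form condition. The expectation $\mathrm{E}[\ln(1 + \rho X)]$ for $X \sim \mathrm{Exp}(1)$ is evaluated with a simple midpoint rule, the effective SNR form is an assumption, the slot duration is normalized to 1, and all function names and the example profile are hypothetical.

```python
import math
from functools import lru_cache


def equivalent_rate(harvest):
    """Average EH rate P_H: total block energy spread evenly over the slots."""
    return sum(harvest) / len(harvest)


@lru_cache(maxsize=None)
def mean_log_rate(snr, steps=2000, upper=30.0):
    """E[ln(1 + snr * X)] for X ~ Exp(1), by the midpoint rule (nats)."""
    dt = upper / steps
    return sum(math.log1p(snr * (i + 0.5) * dt) * math.exp(-(i + 0.5) * dt)
               for i in range(steps)) * dt


def best_training_period(n, p_h, var_h=1.0, noise_var=1.0):
    """Search N_t in 1..n-1 for a constant-rate EH block of n slots."""
    def objective(n_t):
        err = var_h * noise_var / (noise_var + var_h * n_t * p_h)
        snr = (var_h - err) * p_h / (noise_var + err * p_h)   # assumed effective SNR
        return (n - n_t) / n * mean_log_rate(snr)
    return max(range(1, n), key=objective)


harvest = [0, 2, 1, 0, 3, 0, 1, 1]    # hypothetical per-slot energy arrivals
p_h = equivalent_rate(harvest)        # only the mean of the profile is needed
for n in (50, 200, 800):
    n_t = best_training_period(n, p_h)
    print(n, n_t, n_t / n)
```

Under these assumptions the selected $N_t$ grows with the block length while the ratio $N_t / N$ shrinks, in line with the asymptotic behaviour described for solution (6).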

V. Simulation Results 

In this section, we provide simulation results to show the 
importance of training optimization in EH communication 
systems. We will compare the throughput performances of 
the optimal policy, two sub-optimal policies, and several fixed 
training policies. The result of the optimal policy, i.e., the 
solution of Problem 2, is obtained by exhaustive search. The 
sub-optimal solutions include (5) as sub-optimal solution 1, 
and (7) as sub-optimal solution 2. The fixed training policies 
include: fixing a training period value $N_t$, fixing the training
period ratio $\frac{N_t}{N}$, and the conventional 1-slot policy, i.e.,
$N_t = 1$. These training policies will also be compared with
two performance upper bounds. The first assumes perfect CSI
with the same EH process, and will be denoted as "upper bound 1".
The second is the non-EH case with the same total energy in each
transmission block, adopting the optimal channel training in [5],
and will be denoted as "upper bound 2".

In the simulation, we assume that the channel is distributed
as $h \sim \mathcal{CN}(0, 1)$. Both the energy arrival process in each time
slot and the initial energy in the energy buffer are assumed
to be Poisson distributed, with parameter $\lambda_e$ set to 1. The
average SNR is also 1. The simulation is run for 1000 random
EH realizations. We select $N_t = 30$ for the fixed training
period scheme, and $N_t = 0.04N$ for the fixed training period
ratio scheme. The results are shown in Figure 4.
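
For readers who wish to reproduce a comparable setup, the following sketch draws Poisson EH realizations with parameter 1, using only the Python standard library. It is an assumed reconstruction of the energy model described above, not the authors' simulation code; the function names, the block length of 100 slots and the random seed are hypothetical choices.

```python
import math
import random


def poisson_sample(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def eh_realization(rng, n, lam=1.0):
    """Return [E_0, E_1, ..., E_{n-1}]: initial energy plus per-slot arrivals."""
    return [poisson_sample(rng, lam) for _ in range(n)]


rng = random.Random(0)                                   # fixed seed for repeatability
profiles = [eh_realization(rng, 100) for _ in range(1000)]
mean_rate = sum(map(sum, profiles)) / (1000 * 100)       # empirical average EH rate
print(round(mean_rate, 2))
```

Each realization can then be fed to a power allocation routine and the resulting throughput averaged over the 1000 profiles.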

We see that the optimal policy and two sub-optimal policies 
are very close to each other, and all have small gaps to 
the performance bounds. The achievable throughput of sub- 
optimal solution 2 is slightly lower than that of sub-optimal 
solution 1. It should be emphasized that the sub-optimal 
solution 2 does not need the instantaneous realization of the 
EH process, but only the average energy harvesting rate of the 
process, which makes it more practical. We can also find that 
Figure 4. The comparison of different training schemes for various block
lengths assuming a Poisson EH process. 

when N is small, the gaps between all the fixed policies and 
the optimal one are very large, which means that we need to 
adaptively adjust the training period for different EH profiles. 
However, when N is large, the throughput gaps between the 
fixed policies and the optimal one are not very big, except 
for $N_t = 1$. This means that in a low mobility environment,
i.e., with a large N, it is feasible to select a fixed training 
period or ratio not only independent of the EH process, but 
also independent of the block length. 

Next, we elaborate more on the fixed $N_t$ policy, as we
cannot change $N_t$ in some practical systems. Considering
typical values of coherence bandwidth $W_c = 500$ kHz and
coherence time $T_c = 2.5$ ms (from [12]) as an example, the
block length is $N = 1250$. Figure 5 compares the throughput
of the optimal policy with adaptively selected $N_t$ and the fixed
policy with different fixed values of $N_t$. We can see that when
$N_t$ lies in the interval [13, 128], the performance gap between
the fixed policy and optimal policy is within 5%, while the
interval for a 10% gap is [8, 189]. This indicates that as long
as $N_t$ belongs to a proper region, it is a fairly good policy to
fix $N_t$ independent of the instantaneous EH process.
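
As a quick check of the block length quoted above, the number of slots per coherence block can be obtained as the time-bandwidth product, assuming one slot per $1/W_c$ seconds (an assumption about the slot duration that is consistent with the stated $N$, but is not spelled out in the text):

```latex
N = W_c T_c = (500 \times 10^{3}\ \mathrm{Hz}) \times (2.5 \times 10^{-3}\ \mathrm{s}) = 1250,
\qquad
\frac{13}{1250} \approx 1.0\%, \quad \frac{128}{1250} \approx 10.2\%.
```

Under this reading, the 5% interval [13, 128] corresponds to training period ratios between roughly 1% and 10% of the block.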

From these results, we see that if the training period can be 
adaptively adjusted in each block, we can get an approximately
optimal value using the sub-optimal solution 2 in (7). On
the other hand, if the training period needs to be fixed, we 
can select a proper fixed value according to the throughput 
gap requirement, which will work well especially for the low 
mobility environment. 

VI. Conclusions 

In this paper, we investigated the optimal training design 
for EH communication systems, which was shown to be 
quite different from conventional non-EH systems and poses 
new challenges. We found that the training period should be 
carefully selected, especially when the coherence block length 
is not very long. In particular, we proposed two sub-optimal 
Figure 5. Performance of the optimal policy and the fixed policy for $N = 1250$
with a Poisson EH process. The optimal policy adaptively picks a value
of $N_t$ for different energy profiles, while the fixed policy always chooses a
single $N_t$.

training policies to determine the training period and power, 
the second of which is especially attractive as it only requires 
information about the average EH rate instead of the detailed 
energy profile in each transmission block. Furthermore, we 
demonstrated that in low mobility environments, a carefully 
selected fixed training period can provide satisfactory performance,
which provides a practical option for systems that
cannot adaptively adjust the training period. 